Dynameomics: a multi-dimensional analysis-optimized database for dynamic protein data.

Authors

  • Catherine Kehl
  • Andrew M Simms
  • Rudesh D Toofanny
  • Valerie Daggett
Abstract

The Dynameomics project is our effort to characterize the native-state dynamics and folding/unfolding pathways of representatives of all known protein folds by way of molecular dynamics simulations, as described by Beck et al. (in Protein Eng. Des. Select., the first paper in this series). The data produced by these simulations are highly multidimensional in structure and multiple terabytes in size. Both of these features present significant challenges for storage, retrieval and analysis. For optimal data modeling and flexibility, we needed a platform that supported both multidimensional indices and hierarchical relationships between related types of data and that could be integrated within our data warehouse, as described in the accompanying paper directly preceding this one. For these reasons, we have chosen On-line Analytical Processing (OLAP), a multidimensional, analysis-optimized database, as an analytical platform for these data. OLAP is a mature technology in the financial sector, but it has not been used extensively for scientific analysis. Our project is furthermore unusual for its focus on the multidimensional and analytical capabilities of OLAP rather than its aggregation capacities. The dimensional data model and hierarchies are very flexible. The query language is concise for complex analysis and rapid data retrieval. OLAP shows great promise for dynamic protein analysis in bioengineering and biomedical applications. In addition, OLAP may have similar potential for other scientific and engineering applications involving large and complex datasets.
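
As a rough illustration of what "multidimensional indices and hierarchical relationships" can mean in practice, the sketch below models a tiny cube in plain Python. It is not the authors' schema and does not use an OLAP engine or its query language: the dimension names (`fold`, `protein`, `residue`, `time_ps`), the helper functions `slice_rows` and `roll_up`, and all numeric values are hypothetical and chosen only for illustration. The assumed hierarchy is fold > protein > residue.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical dimensions of the cube; fold > protein > residue forms a hierarchy.
DIMENSIONS = ("fold", "protein", "residue", "time_ps")

# Hypothetical fact rows: one value per dimension, then a made-up measure.
FACTS = [
    ("fold_A", "prot_1", 1, 0, 0.5),
    ("fold_A", "prot_1", 1, 1, 0.75),
    ("fold_A", "prot_1", 2, 0, 1.0),
    ("fold_A", "prot_2", 1, 0, 2.0),
    ("fold_B", "prot_3", 1, 0, 3.0),
    ("fold_B", "prot_3", 1, 1, 5.0),
]


def slice_rows(facts, **fixed):
    """Keep only the rows whose dimension values match every entry in `fixed`."""
    return [
        row
        for row in facts
        if all(row[DIMENSIONS.index(dim)] == value for dim, value in fixed.items())
    ]


def roll_up(facts, by):
    """Average the measure over all dimensions that are not listed in `by`."""
    positions = [DIMENSIONS.index(dim) for dim in by]
    cells = defaultdict(list)
    for row in facts:
        key = tuple(row[pos] for pos in positions)
        cells[key].append(row[-1])  # the measure is the last column
    return {key: mean(values) for key, values in cells.items()}


# Roll up to the top of the hierarchy: one averaged cell per fold.
by_fold = roll_up(FACTS, by=("fold",))

# Slice to a single fold, then view it along two dimensions at once.
by_protein_time = roll_up(slice_rows(FACTS, fold="fold_A"), by=("protein", "time_ps"))
```

The same fact rows answer questions at different levels of the hierarchy without being reorganized, which is the flexibility the abstract attributes to the dimensional model. A real OLAP platform would add precomputed indices and a dedicated query language on top of this idea.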

Similar resources

Dynameomics: mass annotation of protein dynamics and unfolding in water by high-throughput atomistic molecular dynamics simulations.

The goal of Dynameomics is to perform atomistic molecular dynamics (MD) simulations of representative proteins from all known folds in explicit water in their native state and along their thermal unfolding pathways. Here we present 188 fold representatives and their native state simulations and analyses. These 188 targets represent 67% of all the structures in the Protein Data Bank. The behavio...

Dynamic Analysis of Multi-Directional Functionally Graded Panels and Comparative Modeling by ANN

In this paper, the dynamic analysis of multi-directional functionally graded panels is studied using a semi-analytical numerical method entitled the state-space based differential method (SSDQM) and comparative behavior modeling by artificial neural network (ANN) for different parameters. A semi-analytical approach which makes use of the three-dimensional elastic theory and assuming the material propert...

Dynameomics: a comprehensive database of protein dynamics.

The dynamic behavior of proteins is important for an understanding of their function and folding. We have performed molecular dynamics simulations of the native state and unfolding pathways of over 2000 protein/peptide systems (approximately 11,000 independent simulations) representing the majority of folds in globular proteins. These data are stored and organized using an innovative database a...

A Step Towards Modeling and Destabilizing Human Trafficking Networks Using Machine Learning Methods

Human trafficking is a multi-dimensional problem for which we have incomplete data, limited knowledge of the exploiters, and no understanding of the dynamics of the process. It is a problem that requires a larger, more complete database, understanding of key actors and their interactions in a dynamic environment. These methods exist in the areas of Data Mining, Machine Learning, Network Analysi...

Robust high-dimensional semiparametric regression using optimized differencing method applied to the vitamin B2 production data

Background and purpose: By evolving science, knowledge, and technology, we deal with high-dimensional data in which the number of predictors may considerably exceed the sample size. The main problems with high-dimensional data are the estimation of the coefficients and interpretation. For high-dimension problems, classical methods are not reliable because of a large number of predictor variable...

Journal:
  • Protein engineering, design & selection : PEDS

Volume 21, Issue 6

Pages: -

Publication date: 2008